ABSTRACT


The purpose of this review is to examine and assess the obstacles that have been encountered in the research and development of
OCR systems for machine vision tasks in the processing of medical image modalities. To this end, the study summarizes
appropriate planning for OCR, identifies challenges in the design of character recognition models for medical image modalities,
with special attention to text data in these images, and, finally, presents the benefits of this technology for the health system, as well
as possible future research directions.
1. INTRODUCTION

In general, the people who require medicines the most,
such as older persons, are often unable to understand
medical terms. In such cases, OCR may be able to help,
since it can detect text regardless of language. OCR is a
technique for converting images (.jpeg, .jpg, .png, and
similar formats) into machine-readable text documents.
OCR is used in a variety of industries, including data
entry, education, and security (password identification,
number plate recognition, and so on). In our work, we
applied it to the medical field [1]. OCR applications use
a variety of techniques, but most focus on one character,
phrase, or group of characters at a time. Characters are
then identified using one of two algorithms.

Pattern recognition: OCR applications are fed samples
of text in a variety of fonts and image formats, which
are then compared with the characters in the scanned
page in order to recognize them.

Feature detection: To recognize characters in a scanned
document, OCR applications use rules based on the
attributes of a single letter or number. For example, the
number of angled lines, crossing lines, or curves in a
character could be used as a comparison feature. The
capital letter "A", for instance, may be stored as two
diagonal lines meeting at the top with a horizontal line
across the middle. When a character is recognized, it is
transformed into an ASCII code that computer systems
can use to perform additional operations.

Using OCR and text summarization, an application can
extract the name of a medicine and show information
such as its name, usage, and dose [2]. We experimented
with various OCR techniques and evaluated their
performance. Optical Character Recognition has an
accuracy of roughly 80% in general. Text extraction is
still an open research topic, with a lot of work being
done in this area.
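
The comparison of a scanned glyph against stored samples can be illustrated with a minimal sketch. The 3x3 templates, the function names, and the pixel-agreement score below are illustrative assumptions, not the method of any particular OCR product; real systems use much larger bitmaps and more robust similarity measures.

```python
# Toy 3x3 glyph templates (hypothetical; "1" is ink, "0" is background).
TEMPLATES = {
    "I": ["010", "010", "010"],
    "L": ["100", "100", "111"],
    "T": ["111", "010", "010"],
}


def match_score(glyph, template):
    """Count the pixel positions where the glyph and the template agree."""
    return sum(
        g == t
        for g_row, t_row in zip(glyph, template)
        for g, t in zip(g_row, t_row)
    )


def recognize(glyph):
    """Return the label of the template that agrees with the glyph on most pixels."""
    return max(TEMPLATES, key=lambda label: match_score(glyph, TEMPLATES[label]))


noisy_l = ["100", "100", "110"]   # an "L" with one pixel missing
label = recognize(noisy_l)
print(label, ord(label))          # prints: L 76  (the ASCII code of "L")
```

Even with one damaged pixel the glyph still agrees with the "L" template on 8 of 9 positions, which is why simple matching tolerates a small amount of noise.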


2. RELATED WORK 


OCR is the conversion by a machine of images of typed
or printed text into machine-encoded text, whether from
a scanned document or image, or from any previously
saved image. The accuracy of word recognition in
images with complex backgrounds is a major challenge
for OCR. Researchers from all over the world have been
working to increase OCR's efficiency. One efficient
approach is to create a library of the necessary
information and compare the post-OCR output with this
database to identify the required output. Character
recognition is not a new issue; in fact, its roots may be
traced back to systems that existed before computers
were invented. The first OCR systems were not
computers but mechanical machines that could
recognize characters only very slowly and with poor
precision. M. Sheppard invented the GISMO reading
robot in 1951, which is considered the first work on
contemporary OCR [1]. GISMO was capable of reading
musical notation as well as words on a printed page,
one at a time. It could recognize only 23 characters. The
machine could also copy a typewritten page. In 1954,
J. Rainbow invented a machine that read uppercase
typeset English characters one at a time. Due to
inaccuracies and slow recognition speed, early OCR
technologies were criticized. As a result, little research
was done on the subject throughout the 1960s and
1970s.


Government institutions and huge enterprises such as
banks, media, and airlines were the only ones to see
changes. Because of the complexity of recognition, it
was decided that standardized OCR fonts should be
created to make the process of OCR recognition easier.
As a result, in 1970, ANSI and ECMA established OCR-A
and OCR-B, which provided comparable recognition
rates [2]. OCR has been the subject of extensive
investigation for the past thirty years. Document image
analysis (DIA) and multi-lingual, handwritten, and
omni-font OCR systems have all emerged from this
work [2]. Despite all of these advances, the machine's
ability to dependably comprehend text is still
significantly inferior to that of a human. As a result,
current OCR research focuses on improving OCR
accuracy and speed for a variety of document styles
printed or written in unconstrained situations. For
difficult languages like Urdu or Sindhi, no free or
commercial software has been available.

In the recent decade, the number of mobile phone users
has exploded all over the world. The rising adoption of
mobile-based applications around the world is due to
easy and affordable internet connectivity. As a result,
mobile applications have proven to be an efficient
means of communicating information. Another way to
give customers a tailored mobile experience is to
support a preferred language in the app. This has been
accomplished using Google Translate or by building
language-specific databases [4]. The pharmaceutical
industry has traditionally played an important part in
the nation's development. However, several studies
have revealed that the general public lacks
understanding of prescription drugs (use, dosage,
precautions, and adverse effects); showing such
information through a mobile-based application would
therefore be extremely beneficial [5].


3. TYPES OF OPTICAL CHARACTER RECOGNITION SYSTEMS

During the last few years, OCR research has gone in a
variety of directions. The numerous types of OCR
systems that have arisen as a result of these studies are
discussed in this section. These systems can be classified
by image capture mode, character connectivity, font
constraints, and so on. The classification of text
recognition systems is shown in Figure 1. Based on the
type of input, OCR systems are divided into
handwritten character recognition and machine-printed
character recognition. The latter is a much easier
problem to solve because letters are usually of
consistent size and their positions on the page can be
predicted [3]. Handwriting recognition is a difficult task
because of users' diverse writing styles and the different
pen movements used for the same character.
Handwriting recognition systems are of two types,
on-line and off-line. The former operate in real time,
while the user is writing the character. They are less
complicated since they can capture temporal
information such as speed, velocity, the number of
strokes made, the direction in which the strokes are
written, and so on. Furthermore, because the pen's trace
is only a few pixels wide, no thinning procedures are
required. Off-line recognition systems work with static
data, i.e. a bitmap as input. As a result, performing
recognition is quite challenging. Many on-line systems
have been proposed since they are easier to create, have
high accuracy, and can be used to input data on tablets
and PDAs [4].


4. APPLICATIONS OF OCR


OCR allows for a wide range of applications. OCR has
been used for mail sorting, bank check reading, and
signature verification since its inception [5].
Furthermore, organizations can employ OCR for
automated form processing in situations where a large
amount of data is available in printed form. Processing
utility bills, passport validation, pen computing, and
automatic number plate identification are some of the
other applications of OCR [6]. Helping blind and
visually challenged persons read text is another useful
application of OCR.


5. METHODOLOGY 

OCR is a multi-phased activity. The following are the
phases:

Image acquisition: An image is captured from an
external source, such as a scanner or a camera.

Preprocessing: After the image has been captured,
various preprocessing operations can be carried out to
increase the image's quality. Noise removal,
thresholding, and baseline extraction are some of the
different preprocessing approaches.

Character segmentation: The characters in an image are
separated so that they may be submitted to a
recognition engine. Connected component analysis and
projection profiles are two of the most basic
methodologies. However, in more complicated cases,
such as when the characters are overlapping or broken
or there is significant noise in the image, advanced
character segmentation algorithms are used.
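
A projection profile can be sketched in a few lines. The example below is a minimal illustration that assumes a binary image stored as a list of rows (1 for ink, 0 for background) and characters separated by at least one empty column; the function names are hypothetical.

```python
def column_profile(image):
    """Vertical projection profile: number of ink pixels in each column."""
    return [sum(column) for column in zip(*image)]


def split_characters(image):
    """Return (start, end) column ranges of characters; `end` is exclusive.

    A character is taken to be a maximal run of columns that contain ink.
    """
    spans, start = [], None
    for x, ink in enumerate(column_profile(image)):
        if ink and start is None:
            start = x                      # first inked column of a character
        elif not ink and start is not None:
            spans.append((start, x))       # an empty column closes the character
            start = None
    if start is not None:                  # character touching the right edge
        spans.append((start, len(image[0])))
    return spans


image = [
    [1, 1, 0, 0, 1, 0, 1, 1],
    [1, 0, 0, 0, 1, 0, 1, 0],
    [1, 1, 0, 0, 1, 0, 1, 1],
]
print(column_profile(image))    # [3, 2, 0, 0, 3, 0, 3, 2]
print(split_characters(image))  # [(0, 2), (4, 5), (6, 8)]
```

This simple rule fails exactly in the complicated cases mentioned above: touching characters share inked columns, and broken characters contain empty ones.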

Feature extraction: After the characters have been
segmented, they are processed to extract various
features. Characters are recognized based on these
features. Moments, for example, are one form of feature
that may be extracted from images. The extracted
features should be computationally efficient, with
minimal intra-class variance and maximum inter-class
variance.
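
As an illustration of moment features, the sketch below computes raw and central image moments for a small binary glyph. The function names and the toy glyph are assumptions for illustration; the definitions follow the standard formula M_pq = sum of x^p * y^q * I(x, y).

```python
def raw_moment(image, p, q):
    """Raw moment M_pq: sum of x**p * y**q * I(x, y) over all pixels."""
    return sum(
        (x ** p) * (y ** q) * value
        for y, row in enumerate(image)
        for x, value in enumerate(row)
    )


def centroid(image):
    """Centre of mass (x, y) of the ink pixels; assumes at least one ink pixel."""
    area = raw_moment(image, 0, 0)
    return raw_moment(image, 1, 0) / area, raw_moment(image, 0, 1) / area


def central_moment(image, p, q):
    """Moment taken about the centroid, so it does not change under translation."""
    cx, cy = centroid(image)
    return sum(
        ((x - cx) ** p) * ((y - cy) ** q) * value
        for y, row in enumerate(image)
        for x, value in enumerate(row)
    )


glyph = [
    [0, 1, 0],
    [0, 1, 0],
    [0, 1, 0],
]
shifted = [[0] + row for row in glyph]   # the same stroke moved one pixel right

print(centroid(glyph))                   # (1.0, 1.0)
print(central_moment(glyph, 0, 2))       # 2.0 (vertical spread of the stroke)
print(central_moment(shifted, 0, 2))     # 2.0 (unchanged by the shift)
```

Translation invariance of the central moments is one way to keep intra-class variance low: the same character yields the same feature value wherever it sits in its bounding box.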

Character classification: This stage assigns the features
of the segmented image to different categories or
classes. Character classification approaches come in a
variety of forms. Structural classification approaches
identify characters using distinct decision rules based
on information collected from the visual structure.
Statistical pattern classification approaches use
probabilistic models and other statistical methods to
classify the characters.

Post-processing: The results, especially for complicated
languages, are not always accurate after classification.
To improve the accuracy of OCR systems,
post-processing techniques can be used. To repair flaws
in OCR output, these techniques use natural language
processing together with geometric and linguistic
context. To boost accuracy, a post-processor can use a
spell checker and dictionary, as well as probabilistic
language models such as Markov chains and n-grams. A
post-processor's time and space complexity should be
minimal, and its use should not result in the creation of
additional errors.

Image Acquisition: The first step in OCR is image
acquisition, which entails capturing a digital image and
putting it into a format that can be processed by a
computer. This can include both image quantization
and compression [8]. Binarization, which uses only two
levels of intensity, is a special case of quantization. In
the vast majority of circumstances, the binary image is
sufficient to describe the image. Lossy or lossless
compression can be used; [9] provides a summary of
different image compression algorithms.

Pre-processing: After image acquisition, pre-processing
is used to improve the image quality. Thresholding is a
pre-processing technique that seeks to binarize an image
based on a threshold value [9]. The value of the
threshold can be specified at the local or global level. A
variety of filters, including average, min, and max
filters, can be used. Different morphological operations,
such as erosion, dilation, opening, and closing, can also
be carried out.
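
Global thresholding and the basic morphological operations can be sketched as follows. This is a minimal illustration, assuming grayscale values in the range 0 to 255, dark text on a light background, and a fixed 3x3 structuring element; the function names and the threshold value are assumptions, and production systems would normally rely on an image-processing library.

```python
NEIGHBOURHOOD = [(dy, dx) for dy in (-1, 0, 1) for dx in (-1, 0, 1)]


def threshold(gray, t):
    """Global thresholding: pixels darker than t become ink (1), others 0."""
    return [[1 if value < t else 0 for value in row] for row in gray]


def _is_ink(binary, y, x):
    """True if (y, x) lies inside the image and is an ink pixel."""
    return 0 <= y < len(binary) and 0 <= x < len(binary[0]) and binary[y][x] == 1


def erode(binary):
    """3x3 erosion: keep a pixel only if its whole neighbourhood is ink."""
    return [
        [int(all(_is_ink(binary, y + dy, x + dx) for dy, dx in NEIGHBOURHOOD))
         for x in range(len(binary[0]))]
        for y in range(len(binary))
    ]


def dilate(binary):
    """3x3 dilation: set a pixel if any pixel in its neighbourhood is ink."""
    return [
        [int(any(_is_ink(binary, y + dy, x + dx) for dy, dx in NEIGHBOURHOOD))
         for x in range(len(binary[0]))]
        for y in range(len(binary))
    ]


def opening(binary):
    """Erosion followed by dilation; removes specks smaller than the element."""
    return dilate(erode(binary))


noisy = [
    [1, 1, 1, 0, 0, 0],
    [1, 1, 1, 0, 0, 0],
    [1, 1, 1, 0, 1, 0],   # the lone 1 on the right is a noise speck
    [0, 0, 0, 0, 0, 0],
]
print(threshold([[12, 240], [130, 90]], 128))   # [[1, 0], [0, 1]]
for row in opening(noisy):
    print(row)   # the 3x3 block survives, the speck is removed
```

Closing is the same pair of operations applied in the opposite order (dilation, then erosion) and fills small gaps instead of removing small specks.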

Character Segmentation: Before moving on to the
classification phase, the image is split into characters.
Segmentation can be done explicitly, or implicitly as a
by-product of the classification phase [11]. In addition,
the other stages of OCR can aid in image segmentation
by supplying contextual information.
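
Explicit segmentation is often based on connected component analysis. The sketch below is one possible minimal version, assuming a binary image as a list of rows and 4-connectivity; the function name and the bounding-box output format are illustrative choices.

```python
from collections import deque


def connected_components(binary):
    """Group ink pixels into 4-connected components.

    Returns one bounding box (top, left, bottom, right) per component,
    ordered from left to right.
    """
    height, width = len(binary), len(binary[0])
    seen = [[False] * width for _ in range(height)]
    boxes = []
    for y0 in range(height):
        for x0 in range(width):
            if binary[y0][x0] != 1 or seen[y0][x0]:
                continue
            # Breadth-first flood fill starting from an unvisited ink pixel.
            queue = deque([(y0, x0)])
            seen[y0][x0] = True
            top, left, bottom, right = y0, x0, y0, x0
            while queue:
                y, x = queue.popleft()
                top, bottom = min(top, y), max(bottom, y)
                left, right = min(left, x), max(right, x)
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    inside = 0 <= ny < height and 0 <= nx < width
                    if inside and binary[ny][nx] == 1 and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            boxes.append((top, left, bottom, right))
    return sorted(boxes, key=lambda box: box[1])


page = [
    [1, 1, 0, 0, 1],
    [1, 0, 0, 1, 1],
    [1, 1, 0, 0, 1],
]
print(connected_components(page))   # [(0, 0, 2, 1), (0, 3, 2, 4)]
```

Each bounding box can then be cropped and passed on to feature extraction. Characters made of several disconnected parts (such as "i") would need an additional merging step, which is omitted here.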
Character Feature Extraction: At this stage, numerous
character features are extracted. Characters are
distinguished by these features. An important research
issue is how to choose the proper features and the total
number of features to use. Various sorts of features can
be employed, including the image itself, geometric
features (loops, strokes), and statistical features
(moments). Finally, principal component analysis and
other approaches can be employed to reduce the
dimensionality of the feature representation.

Character Classification: The process of putting a
character into the right category is known as
classification. The structural approach to classification
is based on the relationships between image
components. Statistical approaches rely on a
discriminant function to classify the image. Bayesian
classifiers, decision tree classifiers, neural network
classifiers, and nearest neighbor classifiers are examples
of statistical classification systems [10]. Finally, there
are classifiers that use a syntactic method, which takes
a grammatical approach to assembling an image from
its sub-components.
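
Of the statistical classifiers listed above, the nearest neighbor classifier is the simplest to sketch. The training set and the two features used below (number of loops, number of straight strokes) are hypothetical and chosen only to keep the example small.

```python
import math

# Hypothetical training set: (feature vector, label).
# Features: (number of loops, number of straight strokes).
TRAINING = [
    ((0, 1), "I"),
    ((0, 2), "L"),
    ((1, 1), "P"),
    ((2, 1), "B"),
]


def classify(features):
    """1-nearest-neighbour: label of the closest training vector (Euclidean)."""
    _, label = min(TRAINING, key=lambda item: math.dist(item[0], features))
    return label


print(classify((2, 1.2)))   # B
print(classify((0, 1.9)))   # L
```

The classifier needs no training step, but every query compares against the whole training set, so its cost grows with the number of stored samples.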

Post-processing: Several techniques can be used to
increase the accuracy of OCR results once the
characters have been classified. One of them is to use
more than one classifier. The classifiers can be arranged
in a cascade, in parallel, or hierarchically, and their
outputs can then be combined in a variety of ways.
Contextual analysis is another option for improving
OCR results. The image's geometrical and document
context can help reduce errors. Lexical processing using
Markov models and dictionaries can also help improve
OCR results.
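
Dictionary-based lexical post-processing can be sketched with the Python standard library. The lexicon, the cutoff value, and the function names below are assumptions for illustration, and the similarity matching of `difflib` stands in for the Markov-model scoring mentioned above.

```python
import difflib

# Hypothetical domain lexicon; a real system would load a full drug dictionary.
LEXICON = ["paracetamol", "ibuprofen", "amoxicillin", "dosage", "tablet"]


def correct_word(word, cutoff=0.8):
    """Replace a word with its closest lexicon entry if it is similar enough.

    Words already in the lexicon, or with no entry above the cutoff, are
    returned unchanged so that the post-processor does not add new errors.
    """
    if word.lower() in LEXICON:
        return word
    matches = difflib.get_close_matches(word.lower(), LEXICON, n=1, cutoff=cutoff)
    return matches[0] if matches else word


def correct_text(text):
    """Apply word-level correction to whitespace-separated OCR output."""
    return " ".join(correct_word(word) for word in text.split())


# Typical OCR confusions: "m" read as "rn", "l" read as "1".
print(correct_text("paracetarnol tab1et twice daily"))
# paracetamol tablet twice daily
```

A high cutoff keeps the corrector conservative: words that resemble nothing in the lexicon pass through untouched.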


Fig 1: Flowchart of proposed method (Input document → Digitization → Noise clearing, text box detection & de-skewing → Line, word, character segmentation → Feature detection → Error correction)


6. RESULT 

The indexing of medical records using the approach we
have suggested, which combines textual description
and visual description, is the first stage of our
investigation. We used computerized medical
annotation to show the value of our technique. The
primary goal of an OCR system built on a grid
architecture is to process, more effectively and
efficiently, electronic versions of documents that were
previously available only on paper. Compared to other
available character recognition techniques, this
increases the accuracy of character recognition during
document processing. Here, the OCR technique derives
the words' meanings. As a result, our suggested
approach performs better when evaluating the answer
scripts. Compared to the current system, the text is
identified and extracted more quickly. Comparing the
similarity between the retrieved answers and the
reference summaries stored in the database is also more
accurate, with an accuracy rate of 85%.


[Figure content: sample document text "Policy for the administration of medicines"]

Fig 2. Input Document

Fig 3. Output OCR Document


We use the BioNLP/NLPBA 2004 shared corpus for the
experiment. In this experiment, we compare the
performance of RNN and CRFs with word embedding.
For the baseline, only n-gram (unigram, bigram,
trigram) features of CRFs are utilized. The Jordan-type
RNN and Elman-type RNN are compared, and at the
same time, Word2Vec, GloVe, and CCA embeddings for
the CRFs are also compared [11].

We use the F1 score as the performance measurement.
The F1 score is calculated by the following expression:

$$F_1 = \frac{2 \cdot \text{precision} \cdot \text{recall}}{\text{precision} + \text{recall}}$$
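
As a worked illustration, the F1 score (the harmonic mean of precision and recall) can be computed from raw counts as in the sketch below. The function name, the count-based interface, and the convention of returning 0.0 when a denominator is zero are assumptions made for this example.

```python
def f1_score(true_positives, false_positives, false_negatives):
    """F1 = 2 * precision * recall / (precision + recall).

    Returns 0.0 when precision or recall is undefined or both are zero.
    """
    predicted = true_positives + false_positives
    actual = true_positives + false_negatives
    if predicted == 0 or actual == 0:
        return 0.0
    precision = true_positives / predicted
    recall = true_positives / actual
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# 8 entities found correctly, 2 spurious, 2 missed:
# precision = 0.8 and recall = 0.8, so F1 is approximately 0.8.
print(round(f1_score(8, 2, 2), 3))   # 0.8
```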
Figures 4, 5 and 6 show the experimental results of
each word embedding method with various dimensions
and window sizes.


7. OUTPUT

The following are the results obtained while carrying
out this work.

Fig 7: Output screen of system


8. CONCLUSION

An overview of several OCR approaches is offered in
this publication. OCR is a multi-phased procedure that
includes acquisition, pre-processing, segmentation,
feature extraction, classification, and post-processing.
This paper goes over each stage in detail. As a future
project, an efficient OCR system could be constructed
using a mix of these strategies. Such a system can be
utilized in a variety of real-time applications, including
number plate recognition and smart libraries. The
classification of medical data is a delicate area of
research because it directly affects people's lives. As a
result, the research conducted in this area must be
precise and achieve high rates of accuracy. It is clear
that additional research and development in this area is
necessary. The suggested method in this study takes the
test results from a medical check-up as input and uses
OCR to extract the text from the medical reports. It then
selects features using the Bag of Words (BOW)
algorithm, a method for extracting features from text
documents, and performs classification. The databases
included various patient test results from medical
procedures. The outcome is obtained when the method
is trained and evaluated against various datasets of
medical test results. This research concludes by
employing OCR to establish a method for accurately
identifying diseases using medical check-up test data
and their interpretation.
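
The Bag of Words representation mentioned above can be illustrated with a short sketch. The sample report strings, the tokenization rule, and the function names are hypothetical; the sketch only shows how OCR text is turned into term-count vectors that a classifier could consume.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_vocabulary(documents):
    """Sorted list of every distinct token seen in the documents."""
    return sorted({token for doc in documents for token in tokenize(doc)})


def bag_of_words(text, vocabulary):
    """Vector of term counts, one entry per vocabulary term."""
    counts = Counter(tokenize(text))
    return [counts[term] for term in vocabulary]


# Hypothetical OCR output from two medical reports.
reports = ["Glucose high, glucose rechecked", "Cholesterol normal"]
vocabulary = build_vocabulary(reports)
print(vocabulary)
# ['cholesterol', 'glucose', 'high', 'normal', 'rechecked']
print(bag_of_words(reports[0], vocabulary))   # [0, 2, 1, 0, 1]
```

Word order is discarded in this representation, which keeps the features simple but means that phrases such as "not high" and "high" are only distinguished by the extra token.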


Conflict of interest statement 
The authors declare that they do not have any conflict of
interest.


REFERENCES 


[1] Anton Patyuchenko (2019), Medical Image Processing: 
From Formation to Interpretation, pp 3-4. Available from: 
https://www.analog.com/media/en/technical-documentat 
ioN/tech-articles/Medical-Image-ProcessingFrom-Format 
ion-to-I terpretation.pdf 

[2] Ackland, P., Resnikoff, S., & Bourne, R. (2017). World
blindness and visual impairment: despite many 
successes, the problem is growing. Community eye 
health, 30(100), 71-73. 

[3] Bhagat, A. P. and Atique, M. (2012) Medical images: 
Formats, compression techniques and DICOM image 
retrieval a survey, 2012 International Conference on 
Devices, Circuits and Systems (ICDCS). IEEE. 
DOI:10.1109/icdcsyst.2012.6188698.

[4] Bhure, A. (2021) A Review of Optical Character
Recognition (OCR) in Healthcare, International Journal
for Research in Applied Science and Engineering
Technology (IJRASET).
DOI:10.22214/ijraset.2021.34142.

[5] Dash, S., Shakyawar, S. K., Sharma, M. and Kaushik, S. 
(2019, June 19) Big data in healthcare: management, 
analysis and future prospects, Journal of Big Data. 
Springer Science and Business Media LLC. 
DOI:10.1186/s40537-019-0217-0. 

[6] Huang, L.-C., Chu, H.-C., Lien, C.-Y., Hsiao, C.-H. and 
Kao, T. (2009) Privacy preservation and information 
security protection for patients’ portable electronic health 
records, Computers in Biology and Medicine, 39 (9), pp. 
743-750. DOI:10.1016/j.compbiomed.2009.06.004. 

[7] Kohli, M. D., Summers, R. M. and Geis, J. R. (2017, May 
17) Medical Image Data and Datasets in the Era of 
Machine Learning— Whitepaper from the 2016 C-MIMI 
Meeting Dataset Session, Journal of Digital Imaging.
Springer Science and Business Media LLC. 
DOI:10.1007/s10278-017-9976-3. 

[8] Li, M., Poovendran, R. and Narayanan, S. (2005) 
Protecting patient privacy against unauthorized release of 
medical images in a group communication environment, 
Computerized Medical Imaging and Graphics. Elsevier 
BV. DOI:10.1016/j.compmedimag.2005.02.003.

[9] Li, X., Hu, G., Teng, X., & Xie, G. (2015). Building
Structured Personal Health Records from Photographs of
Printed Medical Records. AMIA Annual Symposium
Proceedings, 2015, 833-842.

[10] Monteiro, E., Costa, C. and Oliveira, J. L. (2017). A 
De-Identification Pipeline for Ultrasound Medical Images 
in DICOM Format. Journal of Medical Systems, 41(5). 
Available from: 
http://dx.doi.org/10.1007/s10916-017-0736-1 

[11] Song, H.-J., Jo, B.-C., Park, C.-Y., Kim, J.-D. and
Kim, Y.-S. (2016).
Comparison of named entity recognition methodologies 
in biomedical documents. From International Conference 
on Biomedical Engineering Innovation (ICBEI) 2016 
Taichung, Taiwan. 
https://doi.org/10.1186/s12938-018-0573-6 


International Journal for Modern Trends in Science and Technology


